最近要去當兵,所以在進去前多少學一下資料庫。
計算Software Engineer工程師的數量。
mysql> SELECT COUNT(*) FROM employee
-> WHERE title="Software Engineer";
+----------+
| COUNT(*) |
+----------+
| 3 |
+----------+
1 row in set (0.01 sec)
將title做分類。
mysql> SELECT DISTINCT title FROM employee;
+------------------------+
| title |
+------------------------+
| Software Engineer |
| Software Architect |
| Database Administrator |
| Project Manager |
| Test Engineer |
+------------------------+
5 rows in set (0.01 sec)
搭配count,計算總共有幾種title。
mysql> SELECT count(DISTINCT title) FROM employee;
+-----------------------+
| count(DISTINCT title) |
+-----------------------+
| 5 |
+-----------------------+
1 row in set (0.01 sec)
以title,last_name去分類並組成群組,可發現全部都各為一個群組,共分為9組。
mysql> SELECT title, last_name,count(title) FROM employee
-> GROUP BY title,last_name;
+------------------------+-----------+--------------+
| title | last_name | count(title) |
+------------------------+-----------+--------------+
| Database Administrator | Clifford | 1 |
| Database Administrator | Dickens | 1 |
| Project Manager | Clifford | 1 |
| Software Architect | Edward | 1 |
| Software Architect | Gilbert | 1 |
| Software Engineer | Clifford | 1 |
| Software Engineer | Jackman | 1 |
| Software Engineer | Newman | 1 |
| Test Engineer | Chan | 1 |
+------------------------+-----------+--------------+
9 rows in set (0.00 sec)
以title去分類並組成群組,共為5組,且有些組不只一筆資料。
mysql> SELECT title, last_name,count(title) FROM employee
-> GROUP BY title;
+------------------------+-----------+--------------+
| title | last_name | count(title) |
+------------------------+-----------+--------------+
| Database Administrator | Dickens | 2 |
| Project Manager | Clifford | 1 |
| Software Architect | Edward | 2 |
| Software Engineer | Jackman | 3 |
| Test Engineer | Chan | 1 |
+------------------------+-----------+--------------+
5 rows in set (0.00 sec)
而如果只是需要過濾掉重複的資料,那麼可以使用DISTINCT且比較不占用效能。
但如果非純粹去重複,且不考慮效能問題,且GROUP BY也可以處理較複雜的邏輯,所以建議盡量使用GROUP BY
。
而我們也可以搭配GROUP BY,在以title區分為不同群組後,對不同組的資料取其最高之薪水。
mysql> SELECT title,max(salary) FROM employee
-> GROUP BY title;
+------------------------+-------------+
| title | max(salary) |
+------------------------+-------------+
| Database Administrator | 6800 |
| Project Manager | 8500 |
| Software Architect | 8000 |
| Software Engineer | 5500 |
| Test Engineer | 6500 |
+------------------------+-------------+
5 rows in set (0.00 sec)
mysql> SELECT title,min(salary) FROM employee
-> GROUP BY title;
+------------------------+-------------+
| title | min(salary) |
+------------------------+-------------+
| Database Administrator | 6000 |
| Project Manager | 8500 |
| Software Architect | 7200 |
| Software Engineer | 4750 |
| Test Engineer | 6500 |
+------------------------+-------------+
5 rows in set (0.01 sec)
計算出各群組的薪水總和。
mysql> SELECT title,sum(salary) FROM employee
-> GROUP BY title;
+------------------------+-------------+
| title | sum(salary) |
+------------------------+-------------+
| Database Administrator | 12800 |
| Project Manager | 8500 |
| Software Architect | 15200 |
| Software Engineer | 15350 |
| Test Engineer | 6500 |
+------------------------+-------------+
5 rows in set (0.00 sec)
計算出各群組的薪水總和及平均值。
mysql> SELECT title
-> ,SUM(salary)
-> ,AVG(salary)
-> FROM employee
-> GROUP BY title;
+------------------------+-------------+-------------------+
| title | SUM(salary) | AVG(salary) |
+------------------------+-------------+-------------------+
| Database Administrator | 12800 | 6400 |
| Project Manager | 8500 | 8500 |
| Software Architect | 15200 | 7600 |
| Software Engineer | 15350 | 5116.666666666667 |
| Test Engineer | 6500 | 6500 |
+------------------------+-------------+-------------------+
5 rows in set (0.01 sec)
與where不同在於,where是對GROUP BY之前的資料進行過濾,也就是全部的TABLE,而HAVING是針對GROUP BY之後的資料進行過濾。但大多還是使用where居多。
過濾GROUP BY之後指定title為Software Engineer的資料。
mysql> SELECT title,
-> count(*),
-> AVG(salary)
-> FROM employee
-> GROUP BY title
-> HAVING title="Software Engineer";
+-------------------+----------+-------------------+
| title | count(*) | AVG(salary) |
+-------------------+----------+-------------------+
| Software Engineer | 3 | 5116.666666666667 |
+-------------------+----------+-------------------+
1 row in set (0.01 sec)
最後來個綜合練習
將導演去重複並分組後,依照票房高低作排序(降冪),限制在五筆資料。
mysql> SELECT
-> director_name,
-> SUM(gross)
-> FROM `movie`
-> GROUP BY
-> director_name
-> ORDER BY
-> SUM(gross) DESC
-> LIMIT
-> 5;
+-------------------+-------------+
| director_name | SUM(gross) |
+-------------------+-------------+
| Christopher Nolan | 38012181818 |
| James Cameron | 36898631874 |
| Joss Whedon | 28139049796 |
| Peter Jackson | 26897421538 |
| Michael Bay | 25996453730 |
+-------------------+-------------+
5 rows in set (0.00 sec)
mysql> SELECT
-> director_name,
-> count(director_name)
-> FROM `movie`
-> GROUP BY
-> director_name
-> ORDER BY
-> count(director_name) DESC
-> LIMIT
-> 5
->
-> ;
+-------------------+----------------------+
| director_name | count(director_name) |
+-------------------+----------------------+
| Christopher Nolan | 84 |
| Peter Jackson | 84 |
| Bryan Singer | 84 |
| Gore Verbinski | 63 |
| Sam Raimi | 63 |
+-------------------+----------------------+
5 rows in set (0.00 sec)